Abstract

This study compared five density estimation techniques applied to samples 
from a population of 272,244 examinees' ACT English Usage and Mathematics 
Usage raw scores. Unsmoothed frequencies, kernel method, negative 
hypergeometric, four-parameter beta compound binomial, and Cureton-Tukey 
methods were applied to 500 replications of random samples of 500, 1000, 2000, 
and 5000 from these populations. The four-parameter beta compound binomial 
produced the most accurate estimates, and the kernel method yielded only 
slightly less accurate estimates. Cureton-Tukey ranked third in accuracy. 
All methods involving smoothing produced more accurate estimates than 
unsmoothed frequencies except the negative hypergeometric. Negative 
hypergeometric estimates varied erratically by test and score level. The 
methods studied have the potential to improve the estimation of norms and the 
equipercentile equating function. 



Key Words: Nonparametric density estimation, test score models, smoothing 
norms, equating 
A Study of Methods for Estimating Distributions
of Test Scores

Statisticians have traditionally taken a parametric approach to 
estimating a probability density function from sample data: assume or try to 
deduce the function (e.g., binomial, beta, normal, Poisson), then estimate
function parameters from the sample statistics. Only recently have they
actively cultivated a nonparametric approach (Silverman, 1986; Tapia &
Thompson, 1978) involving few or no assumptions about the function. Yet
already one finds a considerable body of theory and methods of nonparametric 
density estimation. 

Nonparametric methods show promise for estimating test score 
distributions from sample data. Here we adapt one of them — the kernel 
method — to estimating discrete test score distributions of ACT English Usage 
and Mathematics Usage tests. Another nonparametric method, the Cureton-Tukey 
weighted moving average method (Cureton & Tukey, 1951), is also studied. We 
compare results by these methods to those from two parametric methods: the 
negative hypergeometric (Lord, 1965) and four-parameter beta compound binomial 
test score models (Keats & Lord, 1962; Lord & Novick, 1968, chap. 23). These 
methods have the potential to improve the estimation of test norms and the 
equipercentile equating function. 

Density Estimation Techniques 
Four techniques for estimating population densities are described, in 
addition to the sample relative frequencies. All of these techniques produce 
discrete density estimates. 

Negative Hypergeometric Distribution (Beta Binomial) 

The negative hypergeometric distribution was described by Keats and Lord 
(1962) and was discussed by Lord and Novick (1968). Lord and Novick (1968) 
present a procedure for generating the distribution given the mean and the
variance for a test of a given length. One way to derive the negative 
hypergeometric is to assume that proportion-correct true scores have a two- 
parameter beta distribution, ranging from 0 to 1, and that for examinees of a 
given proportion-correct true score, observed scores are distributed binomial 
with parameters equal to the number of items and proportion-correct true 
score. The observed score distribution over all examinees that results from 
this process is the negative hypergeometric. The negative hypergeometric 
distribution is often said to be the observed score distribution arising from 
the beta binomial model. 

The negative hypergeometric is a discrete unimodal distribution. If the 
mean proportion-correct score is below .5 then the distribution is positively 
skewed, and if the mean is above .5 then the distribution is negatively 
skewed. Keats and Lord (1962) and Lord and Novick (1968) showed that the 
negative hypergeometric can fit many test score distributions very well. 
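
As a rough illustration of how a negative hypergeometric (beta binomial) distribution can be generated from a mean and a variance, the sketch below matches the first two moments of a beta binomial distribution and then evaluates its probabilities. It is a generic moment-matching sketch and not necessarily the computational procedure given by Lord and Novick (1968); the function names and the example mean and variance are hypothetical.

```python
import math


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def moments_to_beta_params(n_items, mean, variance):
    """Match a beta binomial to a raw-score mean and variance (assumed approach)."""
    p = mean / n_items                             # mean proportion correct
    ratio = variance / (n_items * p * (1.0 - p))   # variance relative to a pure binomial
    if not 1.0 < ratio < n_items:
        raise ValueError("variance is outside the range a beta binomial can reproduce")
    total = (n_items - ratio) / (ratio - 1.0)      # sum of the two beta parameters
    return p * total, (1.0 - p) * total


def beta_binomial_pmf(n_items, a, b):
    """Probabilities of raw scores 0..n_items under the beta binomial model."""
    pmf = []
    for k in range(n_items + 1):
        log_choose = (math.lgamma(n_items + 1) - math.lgamma(k + 1)
                      - math.lgamma(n_items - k + 1))
        pmf.append(math.exp(log_choose + log_beta(k + a, n_items - k + b)
                            - log_beta(a, b)))
    return pmf


# Hypothetical 40-item test with raw-score mean 18 and variance 60.
a, b = moments_to_beta_params(40, 18.0, 60.0)
pmf = beta_binomial_pmf(40, a, b)
```

Only the test length, mean, and variance enter the fit, which is why the shape of the fitted distribution is tied so closely to the mean proportion-correct score.
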
Four Parameter Beta Compound Binomial Method 

To improve the fit to data, Lord (1965) generalized the beta binomial
model. He used a four parameter beta distribution for proportion-correct true 
scores rather than a two parameter beta distribution. This four parameter 
beta distribution has parameters for the high and low proportion-correct true 
scores in addition to the two parameters used to describe the two parameter 
beta distribution. The low parameter is allowed to be greater than zero and 
the high parameter less than one. A lower bound for true scores that is above 
zero seems especially sensible for multiple choice tests, where an examinee 
can correctly answer a substantial proportion of items through random 
guessing. 
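
To make the role of the low and high true score limits concrete, the following sketch simulates observed scores when proportion-correct true scores follow a four parameter beta distribution. It is only an illustration under simplifying assumptions: observed scores given true score are drawn from an ordinary binomial, which is a simplification of the model Lord (1965) used, and the parameter values and the function name are hypothetical.

```python
import random


def simulate_scores(n_items, low, high, a, b, n_examinees, seed=0):
    """Simulate raw scores with four parameter beta true scores (illustrative only)."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_examinees):
        # True score: a beta(a, b) variate rescaled from [0, 1] to [low, high].
        true_score = low + (high - low) * rng.betavariate(a, b)
        # Observed score: number of items answered correctly given the true score.
        scores.append(sum(rng.random() < true_score for _ in range(n_items)))
    return scores


# Hypothetical values: 40 five-alternative items, so the low parameter is set to .2.
scores = simulate_scores(n_items=40, low=0.2, high=1.0, a=2.0, b=3.0,
                         n_examinees=5000)
mean_score = sum(scores) / len(scores)
```

With these hypothetical values the expected raw score is 40[.2 + .8(2/5)] = 20.8, so the simulated mean should fall close to that figure.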
In this model, Lord (1965) used a two-term approximation to the compound
binomial distribution for observed scores given true score. Lord and Novick
(1968, p. 525) suggested that the compound binomial may be more realistic than 
the binomial for this situation. Practically speaking, one major difference 
between the binomial and the two term approximation to the compound binomial 
is that the latter typically has smaller variance. The observed score 
distribution under this model is the four parameter beta compound binomial 
distribution. 

The four parameter beta compound binomial distribution is unimodal. It 
is more general than the negative hypergeometric. For instance, it can be 
positively skewed even if the mean proportion-correct score is above .5.

Lord (1965) presented a method for estimating the parameters of this 
distribution that is based on the method of moments, and the observed score 
distribution is computed analytically. In implementing the method of moments, 
sometimes the estimate of the high parameter exceeds 1. In such cases, the 
high parameter is fixed at 1.0 and the remaining three parameters are
estimated by the method of moments. 
Cureton-Tukey Estimation

Cureton and Tukey (1951) described a method in which the estimated 
relative frequency for a given score is found by taking a weighted average of 
the relative frequencies at that score and at surrounding scores. A method 
using seven relative frequencies in the averaging procedure was used here. 

For a relative frequency at score x, $f(x)$, the smoothed relative
frequency, $f_s^*(x)$, is taken as

$$f_s^*(x) = \frac{-2f(x-3) + 3f(x-2) + 6f(x-1) + 7f(x) + 6f(x+1) + 3f(x+2) - 2f(x+3)}{21}.$$

According to Angoff (1982),
these weights were chosen to preserve "the parabolic and cubic trends within
successive sets of points" (p. 68).
This procedure sometimes produces negative relative frequencies near the
extremes. When negative relative frequencies occur, they are set to zero. In
addition, sometimes the weights are supposed to be applied to scores outside
the range of possible scores on the test. For example, on a 40-item
test, $f(43)$ would be involved in finding $f_s^*(40)$. It was assumed in this
procedure that relative frequencies outside the range of possible scores were
zero. The smoothing process sometimes results in $\sum_x f_s^*(x) \neq 1$. For this reason,
we define $\hat{f}_s(x) = f_s^*(x) / \sum_{x=0}^{K} f_s^*(x)$.
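
The Cureton-Tukey steps described above (seven-point weighted average, zero frequencies outside the score range, negative values set to zero, and rescaling so the estimates sum to one) can be sketched as follows. The function name and the example frequencies are hypothetical.

```python
# Seven-point weights for offsets -3..+3; they sum to 21.
WEIGHTS = (-2, 3, 6, 7, 6, 3, -2)


def cureton_tukey(freqs):
    """Smooth relative frequencies for scores 0..K with the seven-point weights."""
    max_score = len(freqs) - 1
    raw = []
    for x in range(max_score + 1):
        total = 0.0
        for offset, weight in zip(range(-3, 4), WEIGHTS):
            i = x + offset
            # Frequencies outside the possible score range are taken as zero.
            if 0 <= i <= max_score:
                total += weight * freqs[i]
        # Negative smoothed frequencies are set to zero.
        raw.append(max(total / 21.0, 0.0))
    norm = sum(raw)
    # Rescale so the smoothed relative frequencies sum to one.
    return [value / norm for value in raw]


# Hypothetical relative frequencies for a 6-item test (scores 0..6).
smoothed = cureton_tukey([0.05, 0.10, 0.20, 0.30, 0.20, 0.10, 0.05])
```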

Kernel Estimation 

The kernel estimator was proposed by Rosenblatt (1956). The idea behind 
kernel estimation is to spread out the density of each observed score point 
using a probability density function. This probability density function is 
referred to as the kernel. The kernel estimator has been used most often with 
continuous data, and the normal distribution is often used as the kernel. In 
kernel estimation, a parameter is manipulated which controls the degree of 
smoothing. Silverman (1986) described in detail the use of the kernel 
estimator with continuous data. 

In this paper, a kernel estimator is developed for discrete raw test 
score distributions. This estimator uses a binomial kernel to produce a 
discrete density estimate. The parameter H is an even integer that is the 
binomial "number of trials" parameter. H is set by the investigator, and 
larger values of H result in more smoothing. The "probability of success" 
binomial parameter is .5. For a test with K items and an observed relative 
frequency distribution $f(x)$, $x = 0, 1, \ldots, K$, this kernel estimator is

$$f_s^*(x) = \sum_{i=0}^{K} \mathrm{Bin}(i - x + H/2 \mid H, .5)\, f(i), \qquad (1)$$

$x = 0, 1, \ldots, K$, where $H$ is an even integer, and

$$\mathrm{Bin}(y \mid H, .5) = \begin{cases} \binom{H}{y} (.5)^{y} (1 - .5)^{H-y}, & y = 0, 1, \ldots, H, \\ 0, & \text{otherwise.} \end{cases} \qquad (2)$$

Because $\sum_{x=0}^{K} f_s^*(x)$ does not necessarily equal one, the estimator in Equation 1
is adjusted, and this adjusted kernel estimator is

$$\hat{f}_s(x) = f_s^*(x) \Big/ \sum_{x=0}^{K} f_s^*(x). \qquad (3)$$

To better understand Equations 1 and 2, first consider the special case
when H = 0. In this case, $\mathrm{Bin}(i - x + H/2 \mid 0, .5) = \mathrm{Bin}(i - x \mid 0, .5)$. By
Equation 2, if $i = x$ then $\mathrm{Bin}(0 \mid 0, .5) = 1$, and if $i \neq x$ then
$\mathrm{Bin}(i - x \mid 0, .5) = 0$. Thus, for all x, when H = 0, $f_s^*(x) = f(x)$. That is, the observed
relative frequency distribution is the kernel estimator when H = 0.

Now consider the case when H = 2. From Equation 2, $\mathrm{Bin}(0 \mid 2, .5) = .25$,
$\mathrm{Bin}(1 \mid 2, .5) = .50$, and $\mathrm{Bin}(2 \mid 2, .5) = .25$. All other values of Bin are 0
when H = 2. From Equation 1, if H = 2 and $i = x$, then $i - x + H/2 = 1$ and
$\mathrm{Bin}(1 \mid 2, .5) = .50$. Similarly, if $i - 1 = x$ or if $i + 1 = x$, then Bin =
.25. For all other values of i, Bin = 0. Thus,
$f_s^*(x) = .25 f(x - 1) + .50 f(x) + .25 f(x + 1)$, where $f(x)$ is defined to be zero for $x < 0$ or
$x > K$. This indicates that $f_s^*(x)$ can be written as a weighted sum of relative
frequencies.
So far, we have suggested two interpretations of kernel estimators. One
is that kernel estimators spread the density at each score point to other
score points. The second is that the estimated density is a weighted sum of
the observed densities. This second interpretation would suggest that, for
discrete distributions, the Cureton-Tukey method presented earlier is similar
to the kernel estimator. Actually, the only reason that the Cureton-Tukey
estimator cannot qualify as a kernel estimator is that it uses negative
weights. Both estimators are in the class of estimators described by
Silverman (1986) as general weight function estimators.

A hypothetical example of the kernel method with H = 2 is presented in
Table 1. First, assume the test has 5 items and there are 10 examinees. The
third column of the table shows the computations involved to estimate each
$f_s^*$ by Equations 1 and 2. Note that the weight .5 is applied to the relative
frequency at the point and .25 to the two adjacent points, which suggests the
weighted sum of the relative frequencies interpretation of $f_s^*$. Now focus on
the .4 relative frequency at a score of 3. As can be seen, a relative
frequency of .25(.4) = .1 is spread to scores of 2 and 4 and .5(.4) = .2 is
kept at a score of 3, which suggests the spreading of density
interpretation. The adjusted estimates that would result if the test had 4
items are shown in the rightmost column in Table 1. In this case, each relative
frequency in the fourth column of the table was multiplied by 1/.925.
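
The Table 1 computations can be reproduced with a short sketch of Equations 1 through 3. The function names are hypothetical; the relative frequencies are those of the hypothetical example in Table 1.

```python
from math import comb


def binomial_weight(y, h):
    """Bin(y | H, .5) from Equation 2; zero when y is outside 0..H."""
    return comb(h, y) * 0.5 ** h if 0 <= y <= h else 0.0


def kernel_unadjusted(freqs, h):
    """Equation 1: unadjusted smoothed relative frequencies for scores 0..K."""
    max_score = len(freqs) - 1
    return [sum(binomial_weight(i - x + h // 2, h) * freqs[i]
                for i in range(max_score + 1))
            for x in range(max_score + 1)]


def kernel_adjusted(freqs, h):
    """Equation 3: rescale the Equation 1 values so that they sum to one."""
    raw = kernel_unadjusted(freqs, h)
    total = sum(raw)
    return [value / total for value in raw]


# Relative frequencies from Table 1 for scores 0, 1, ..., 5.
table1 = [0.0, 0.1, 0.2, 0.4, 0.3, 0.0]
five_items = kernel_adjusted(table1, 2)
# The 4-item case keeps only the frequencies for scores 0..4.
four_items = kernel_adjusted(table1[:5], 2)
```

Rounded to three places, `five_items` and `four_items` give the values in the two rightmost columns of Table 1, listed here from score 0 upward. In the 4-item case the unadjusted values sum to .925, which is the source of the 1/.925 multiplier.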



Insert Table 1 about here 



Because a binomial kernel with a parameter of .5 is used, there is an
interesting relationship that involves repetition of the kernel procedure.
Consider a situation where Equations 1 and 2 are applied first to the original
relative frequencies and then again to the smoothed relative frequencies. The
resulting distribution will be the same as that which would have been obtained 
by applying Equations 1 and 2 once with H equal to the sum of the two H's in 
the repeated application. For example, using H = 4 will result in the same 
smoothed distribution as applying H = 2 twice. 
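
The repetition property can be checked numerically with the Table 1 frequencies. The sketch below re-implements Equations 1 and 2 under a hypothetical function name and compares two passes with H = 2 against a single pass with H = 4, using the unadjusted estimates of Equation 1. In this example no density is spread outside the score range on the first pass; how the property behaves when density is cut off at the ends of the score scale is not examined here.

```python
from math import comb


def kernel_unadjusted(freqs, h):
    """Equations 1 and 2: binomial-kernel smoothing of relative frequencies."""
    max_score = len(freqs) - 1

    def weight(y):
        # Bin(y | H, .5); zero when y is outside 0..H.
        return comb(h, y) * 0.5 ** h if 0 <= y <= h else 0.0

    return [sum(weight(i - x + h // 2) * freqs[i] for i in range(max_score + 1))
            for x in range(max_score + 1)]


table1 = [0.0, 0.1, 0.2, 0.4, 0.3, 0.0]   # Table 1 relative frequencies, scores 0..5
twice_h2 = kernel_unadjusted(kernel_unadjusted(table1, 2), 2)
once_h4 = kernel_unadjusted(table1, 4)
largest_gap = max(abs(u - v) for u, v in zip(twice_h2, once_h4))
```

The largest difference between the two results is zero up to floating-point rounding, as expected when two binomial kernels with parameter .5 are combined.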
Illustration 

To illustrate the results produced by the methods, each was applied to a 
frequency distribution of ACT Mathematics scores based on 3,039 examinees. 
This test has 40 multiple-choice items. The results are shown in Figure 1. 
In this figure, the observed frequency distribution is represented by a solid 
curve and the fitted distributions by a dotted curve. 



Insert Figure 1 about here 



The negative hypergeometric appears to fit poorly. The fitted 
frequencies are too high at the very low scores and at middle scores above 20 
and too low at other score points. The observed distribution is positively 
skewed with a mean above .5, while the fitted distribution is nearly 
symmetric, which may be part of the reason for the apparent poor fit. 

The four parameter beta compound binomial appears to fit this 
distribution very well. The Cureton-Tukey fitted distribution is close to the 
observed distribution. However, it is not very smooth. This is a problem we 
have often noted with the Cureton-Tukey method. 

The kernel method is shown with H = 4, 8, 16, and 32. The distributional 
fit with H = 4 stays reasonably close to the observed distribution, although 
the fitted distribution is somewhat bumpy. As H is increased the fitted 
distribution becomes less bumpy, although it departs more from the observed
distribution. For H = 16 and H = 32, the fitted frequencies are above the
observed frequencies at the lower scores. 

Overall, the negative hypergeometric appears to fit this Mathematics 
distribution poorly, the four parameter beta compound binomial appears to fit 
very well, the Cureton-Tukey method fitted distribution is not very smooth, 
and the kernel method seems promising. 

Comparing the Methods 

Mathematics and English test score distributions from a recent October 
administration of the ACT Assessment to 272,244 examinees were used to compare 
the methods. The Mathematics test contains 40 five-alternative multiple 
choice test questions, and the English test contains 75 four-alternative 
multiple choice questions. 
Comparison Methodology 

The relative frequency distribution for the 272,244 examinees was 
considered to be the population density. The following procedure was used to 
evaluate the methods: 

1. Draw a random sample of size N from the population density $f(x)$,
$x = 0, 1, \ldots, K$, and refer to this sample as replication r.

2. Construct the observed relative frequency distribution $f_r(x)$,
$x = 0, 1, \ldots, K$.

3. Estimate the relative frequencies using each of the techniques
described earlier, and refer to this estimated relative frequency as
$\hat{f}_{rs}(x)$, $x = 0, 1, \ldots, K$.

4. Repeat steps 1-3 R times.
This process was repeated for N = 500, 1000, 2000, and 5000, each with R = 500
replications. The Cureton-Tukey, negative hypergeometric, four parameter beta
compound binomial (4PB), and kernel methods were used in step 3.
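
The replication procedure in steps 1 through 4 can be sketched as below. Drawing independent scores from the population density is an assumption about how the random samples were taken, and the function names, the small population density, and the number of replications in the example are hypothetical.

```python
import random


def sample_relative_frequencies(population, n, rng):
    """Steps 1 and 2: draw n scores from the population density and tabulate them."""
    max_score = len(population) - 1
    scores = rng.choices(range(max_score + 1), weights=population, k=n)
    counts = [0] * (max_score + 1)
    for score in scores:
        counts[score] += 1
    return [count / n for count in counts]


def run_replications(population, n, n_reps, estimator, seed=0):
    """Steps 3 and 4: apply an estimator to each of n_reps independent samples."""
    rng = random.Random(seed)
    return [estimator(sample_relative_frequencies(population, n, rng))
            for _ in range(n_reps)]


# Hypothetical 4-item population density; the identity estimator returns
# the unsmoothed sample relative frequencies.
population = [0.1, 0.2, 0.4, 0.2, 0.1]
estimates = run_replications(population, n=500, n_reps=20, estimator=lambda f: f)
```

Any smoothing method that maps a list of relative frequencies to a list of estimates can be passed as `estimator`.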
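The following statistics were calculated at each x for each method,
including the observed frequencies:

$$\text{Bias}_x^2 = \left[ \sum_{r=1}^{R} \hat{f}_{rs}(x)/R - f(x) \right]^2, \qquad (4)$$

$$\text{Variance}_x = \sum_{r=1}^{R} \left[ \hat{f}_{rs}(x) - \sum_{r'=1}^{R} \hat{f}_{r's}(x)/R \right]^2 \Big/ R, \text{ and} \qquad (5)$$

$$\text{MSE}_x = \sum_{r=1}^{R} \left[ \hat{f}_{rs}(x) - f(x) \right]^2 \Big/ R = \text{Bias}_x^2 + \text{Variance}_x. \qquad (6)$$

In addition, statistics over all score points were calculated as

$$\text{Bias}^2 = \sum_{x=0}^{K} \text{Bias}_x^2 / (K + 1), \qquad (7)$$

$$\text{Variance} = \sum_{x=0}^{K} \text{Variance}_x / (K + 1), \text{ and} \qquad (8)$$

$$\text{MSE} = \sum_{x=0}^{K} \text{MSE}_x / (K + 1). \qquad (9)$$
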

The Equation 4 through 9 statistics are based on the estimation of relative 
frequencies, and can be viewed as adaptations, to discrete distributions, of 
the integrated root mean squared framework for evaluating distributional fit
described by Silverman (1986). 
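
A minimal sketch of the Equation 4 through 9 statistics follows. It assumes the estimates from all replications are held as a list of lists (one list of K + 1 values per replication); the function names and the small numerical example are hypothetical.

```python
def pointwise_stats(estimates, population):
    """Equations 4-6 at each score point: squared bias, variance, and MSE."""
    n_reps = len(estimates)
    bias_sq, variance, mse = [], [], []
    for x, f_x in enumerate(population):
        values = [estimate[x] for estimate in estimates]
        mean = sum(values) / n_reps
        bias_sq.append((mean - f_x) ** 2)
        variance.append(sum((v - mean) ** 2 for v in values) / n_reps)
        mse.append(sum((v - f_x) ** 2 for v in values) / n_reps)
    return bias_sq, variance, mse


def overall(stat_by_score):
    """Equations 7-9: average a pointwise statistic over the K + 1 score points."""
    return sum(stat_by_score) / len(stat_by_score)


# Hypothetical population density and three replications of estimates.
population = [0.2, 0.5, 0.3]
estimates = [[0.25, 0.45, 0.30], [0.15, 0.55, 0.30], [0.20, 0.40, 0.40]]
bias_sq, variance, mse = pointwise_stats(estimates, population)
```

Because the variance is computed with divisor R around the mean over replications, the mean squared error at each score point equals the squared bias plus the variance, as stated in Equation 6.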

A statistic based on relative cumulative frequencies also was used, 
because relative cumulative frequencies typically are the basis for 
calculating norms and for equipercentile equating. Only an overall statistic 
was calculated, which is

$$\text{K-S} = \sum_{r=1}^{R} \sup_x \left| \hat{F}_{rs}(x) - F(x) \right| \Big/ R. \qquad (10)$$
The K-S statistic in Equation 10 is an adaptation of the Kolmogorov-Smirnov
statistic. For a given replication, $\hat{F}_{rs}(x) = \sum_{i=0}^{x} \hat{f}_{rs}(i)$, which is the relative
cumulative frequency at x. $F(x) = \sum_{i=0}^{x} f(i)$ is the distribution function value at x. Sup is
the supremum over x. Thus, in Equation 10 the greatest difference, over score
points, between the estimated relative cumulative frequency and the population
distribution function is found for each replication, and then averaged over
replications.
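
The K-S index of Equation 10 can be sketched as follows. Taking the difference between the two cumulative distributions in absolute value is an assumption, in line with the usual Kolmogorov-Smirnov statistic; the function name and the example values are hypothetical.

```python
from itertools import accumulate


def ks_index(estimates, population):
    """Average, over replications, of the largest cumulative discrepancy."""
    cumulative_pop = list(accumulate(population))
    total = 0.0
    for estimate in estimates:
        cumulative_est = list(accumulate(estimate))
        # Largest absolute difference over score points for this replication.
        total += max(abs(est - pop)
                     for est, pop in zip(cumulative_est, cumulative_pop))
    return total / len(estimates)


# Hypothetical 1-item example with two replications.
index = ks_index([[0.5, 0.5], [0.0, 1.0]], [0.5, 0.5])
```

In this example the first replication matches the population exactly and the second differs by .5 at a score of 0, so the index is .25.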

Results 

Tables 2 and 3 compare results of applying the density estimation methods
to samples from distributions of ACT English Usage and Mathematics Usage raw
scores. Figures 2 and 3 plot MSE × 10,000 shown in Tables 2 and 3 against
method.



Insert Tables 2 and 3 about here



Insert Figures 2 and 3 about here



The four-parameter beta compound binomial (4PB) shows the lowest MSE and
K-S statistics for both tests and all four sample sizes. The second lowest
MSE and K-S statistics are associated with kernel estimates of varying degrees
of smoothing. However, the $\text{Bias}^2$ of 4PB exceeds that of unsmoothed,
Cureton-Tukey, and kernel with low H. Thus 4PB owes its low MSE to low Variance
rather than low $\text{Bias}^2$.

Results for the kernel method show that the optimal amount of smoothing
varies according to test and sample size. In addition, there is a tradeoff
between $\text{Bias}^2$ and Variance. Namely, increased $\text{Bias}^2$ accompanies reduced
Variance. For English, MSE decreases up to a binomial smoothing parameter (H)
of about 76 for samples of 500 and 1000. For samples of 2000 and 5000, the
optimum H appears to be closer to 32. For Mathematics, the lowest MSE is
obtained with H = 16 for samples of 500 and 1000, and H = 8 for samples of
2000 and 5000. Although Variance continues to decrease with increased H,
$\text{Bias}^2$ continues to increase.

MSE for the negative hypergeometric model shows a striking difference 
between the two tests. Estimation by this model yields much lower MSE for 
English than for Mathematics. 

Figures 4 and 5 plot $\text{Bias}_x^2$, $\text{Variance}_x$, and $\text{MSE}_x$ under the different
methods for a sample size of 1000.



Insert Figures 4 and 5 about here



$\text{Bias}_x^2$ for the negative hypergeometric is
lower than $\text{Bias}_x^2$ for the other methods at most points of the score scale.
$\text{Variance}_x$ shows a much more even pattern for smoothed frequencies. $\text{MSE}_x$,
being the sum of $\text{Bias}_x^2$ and $\text{Variance}_x$, retains some of the bumpiness of
$\text{Bias}_x^2$, particularly for negative hypergeometric.

$\text{MSE}_x$ of the kernel method tends to be slightly greater than that of 4PB
except at very high English and very low Mathematics scores, where kernel
shows greater $\text{Bias}_x^2$. This greater $\text{Bias}_x^2$ results from the kernel method's
tendency to overestimate frequencies at the ends of the score scales. The
overestimation increases as the smoothing parameter increases.

Summary and Discussion 

The kernel and 4PB methods clearly do the best job of estimating the two
score distributions studied. This result essentially agrees with Divgi's
(1983) findings. He found a four parameter beta binomial model performed
better than a smoothed cumulative distribution function, two- and
three-parameter beta binomial models, and a polynomial smoothing of the distribution
function. The 4PB method shows slightly lower mean squared error ($\text{MSE}_x$) than
the kernel method over most of the score scales.

One way to compare the methods studied here is on the sample size 
required to achieve equal levels of estimation error. Refer to Tables 2 and 
3. The MSE for the 4PB method at N = 500 is smaller than the MSE for the 
unsmoothed sample frequencies at N = 5000 for both English and Mathematics. 
Therefore, the use of the 4PB method has an effect on MSE that is similar to 
using the sample relative frequencies and increasing sample size tenfold. 
Note that the effect of the 4PB method on the K-S statistic is less drastic. 
From Tables 2 and 3, the 4PB method appears to be as effective in decreasing 
the K-S index as a two to two and one-half-fold increase in sample size. The 
kernel method for the H with the lowest MSE performed nearly as well as the 
4PB method. 

In planning a norming study a target value for estimation error often is 
stated and used in specifying the sample size required. The results of this 
study suggest that the sample size needed to meet the target estimation error 
may be lowered substantially by using the 4PB or kernel methods. 

Kernel $\text{MSE}_x$ tends to increase at extremely high and low scores owing to a
positive bias; estimated frequencies at the ends tend to be higher as the 
smoothing parameter increases. This bias merits concern, especially in 
relation to norms estimation. The adaptive kernel method (Silverman, 1986) 
shows promise for reducing such bias. This method changes the kernel function 
according to observed relative frequencies along the score scale. 

The Cureton-Tukey method failed to perform nearly as well as 4PB and 
kernel; nevertheless, it yielded an improvement over no smoothing, and 
introduced little bias. Its ease and simplicity of computation make its use
still worth considering. 

The erratic performance of the negative hypergeometric method prompts us
to advise extreme caution in applying it. Under this method, $\text{Bias}_x^2$
fluctuated wildly along both score scales. Also, the $\text{Bias}^2$ and MSE appear to
depend greatly upon the particular shape of the population distribution: MSE 
for English remained within reasonable limits, but for Mathematics Usage MSE 
often far exceeded that of unsmoothed frequencies. 

In sum, all but one of the methods produced density estimates much closer
on the average to population densities than did unsmoothed sample data. We
expect such methods to find extensive application to future analysis of test
score data.
Table 1

Hypothetical Example of Kernel Estimation with H = 2

x    f(x)   .25 f(x+1) + .5 f(x) + .25 f(x-1)    f_s (K=5)   f_s (K=4)

5    .0     .25(.0) + .5(.0) + .25(.3)           .075
4    .3     .25(.0) + .5(.3) + .25(.4)           .250        .270
3    .4     .25(.3) + .5(.4) + .25(.2)           .325        .351
2    .2     .25(.4) + .5(.2) + .25(.1)           .225        .243
1    .1     .25(.2) + .5(.1) + .25(.0)           .100        .108
0    .0     .25(.1) + .5(.0) + .25(.0)           .025        .027
Table 2

Fit of ACT English Usage Estimated Densities

       Measure    Unsmoothed                             4 Parameter
       of fit     Sample       Cureton-  Negative        Beta Compound  Kernel
N      × 10,000   Frequencies  Tukey     Hypergeometric  Binomial       H=2     H=4     H=16    H=32    H=76

500    Bias^2     .0005        .0005     .0200           .0013          .0003   .0003   .0006   .0011   .0038
       Variance   .2590        .0828     .0058           .0094          .0941   .0670   .0312   .0206   .0118
       MSE        .2595        .0833     .0258           .0108          .0944   .0673   .0318   .0217   .0156
       K-S        .0353        .0313     .0293           .0213          .0321   .0309   .0276   .0258   .0246

1000   Bias^2     .0002        .0003     .0194           .0011          .0002   .0003   .0005   .0011   .0039
       Variance   .1285        .0407     .0028           .0048          .0463   .0329   .0154   .0102   .0059
       MSE        .1287        .0410     .0222           .0058          .0465   .0332   .0160   .0114   .0098
       K-S        .0246        .0224     .0249           .0151          .0224   .0216   .0194   .0183   .0188

2000   Bias^2     .0001        .0003     .0193           .0010          .0002   .0002   .0005   .0012   .0041
       Variance   .0645        .0203     .0015           .0024          .0231   .0165   .0078   .0052   .0030
       MSE        .0646        .0206     .0207           .0033          .0233   .0167   .0083   .0064   .0071
       K-S        .0174        .0158     .0226           .0112          .0159   .0153   .0139   .0136   .0156

5000   Bias^2     .0000        .0003     .0192           .0009          .0002   .0002   .0005   .0012   .0042
       Variance   .0258        .0083     .0006           .0009          .0094   .0067   .0032   .0021   .0012
       MSE        .0258        .0085     .0198           .0018          .0095   .0069   .0037   .0033   .0053
       K-S        .0110        .0100     .0207           .0075          .0100   .0097   .0090   .0093   .0134

For a given sample size, the lowest two MSE and K-S appear in boldface.
Table 3

Fit of ACT Mathematics Usage Estimated Densities

       Measure    Unsmoothed                             4 Parameter
       of fit     Sample       Cureton-  Negative        Beta Compound  Kernel
N      × 10,000   Frequencies  Tukey     Hypergeometric  Binomial       H=2     H=4     H=8     H=16

500    Bias^2     .0009        .0013     .2801           .0082          .0012   .0022   .0050   .0131
       Variance   .4841        .1470     .0168           .0348          .1690   .1185   .0806   .0532
       MSE        .4849        .1483     .2969           .0431          .1703   .1207   .0856   .0663
       K-S        .0344        .0303     .0466           .0229          .0306   .0291   .0274   .025?

1000   Bias^2     .0003        .0009     .2774           .0072          .0007   .0016   .0043   .0122
       Variance   .2392        .0724     .0083           .0178          .0835   .0584   .0398   .0265
       MSE        .2395        .0733     .2857           .0251          .0842   .0601   .0441   .0387
       K-S        .0240        .0213     .0426           .0166          .0214   .0204   .0193   .0190

2000   Bias^2     .0001        .0008     .2776           .0065          .0007   .0016   .0042   .0121
       Variance   .1192        .0366     .0044           .0090          .0419   .0295   .0202   .0135
       MSE        .1194        .0374     .2819           .0155          .0426   .0311   .0245   .0256
       K-S        .0170        .0150     .0408           .0126          .0152   .0145   .0140   .0151

5000   Bias^2     .0001        .0007     .2779           .0060          .0006   .0015   .0042   .0122
       Variance   .0474        .0149     .0016           .0035          .0170   .0120   .0082   .0054
       MSE        .0475        .0156     .2795           .0095          .0176   .0135   .0124   .0176
       K-S        .0107        .0095     .0395           .0088          .0096   .0093   .0095   .0129

Within a given sample size, the lowest two MSE and K-S appear in boldface.
Figure 1. Fitted distributions for an ACT Mathematics form. (Observed
distribution represented by solid line. Fitted distribution
represented by dotted line.)

Figure 1 (continued). Fitted distributions for an ACT Mathematics
form. (Observed distribution represented by solid line.
Fitted distribution represented by dotted line.)

Figure 2. MSE of density estimation methods, ACT English Usage.

Figure 3. MSE of density estimation methods, ACT Mathematics Usage.

Figure 4. $\text{Bias}_x^2$, $\text{Variance}_x$, and $\text{MSE}_x$ of ACT English Usage densities
estimated from samples of 1000. (UNS - unsmoothed, NH - negative
hypergeometric, KER - kernel, 4PB - four-parameter beta compound
binomial, CT - Cureton-Tukey.)

Figure 5. $\text{Bias}_x^2$, $\text{Variance}_x$, and $\text{MSE}_x$ of ACT Mathematics Usage densities
estimated from samples of 1000. (UNS - unsmoothed, NH - negative
hypergeometric, KER - kernel, 4PB - four-parameter beta compound
binomial, CT - Cureton-Tukey.)